A Salience-Based Approach to Gesture-Speech Alignment

Authors

  • Jacob Eisenstein
  • C. Mario Christoudias
Abstract

One of the first steps towards understanding natural multimodal language is aligning gesture and speech, so that the appropriate gestures ground referential pronouns in the speech. This paper presents a novel technique for gesture-speech alignment, inspired by salience-based approaches to anaphoric pronoun resolution. We use a hybrid between data-driven and knowledge-based methods: the basic structure is derived from a set of rules about gesture salience, but the salience weights themselves are learned from a corpus. Our system achieves 95% recall and precision on a corpus of transcriptions of unconstrained multimodal monologues, significantly outperforming a competitive baseline.

Similar resources

On Factoring Out a Gesture Typology from the Bielefeld Speech-and-Gesture-Alignment Corpus (SAGA)

People communicate multimodally. Most prominently, they co-produce speech and gesture. How do they do that? Studying the interplay of both modalities has to be informed by empirically observed communication behavior. We present a corpus built of speech and gesture data gained in a controlled study. We describe 1) the setting underlying the data; 2) annotation of the data; 3) reliability evaluti...

Coexpressivity of speech and gesture: Lessons for models for aligned speech and gesture production

When people combine language and gesture to convey their intended information, both modalities are characterized by an intriguing degree of coherence and consistency. For developing an account of how speech and gesture are aligned to each other, one question of major importance is how meaning is distributed across the two channels. In this paper, we start from recent empirical findings indicating ...

Toward Natural Gesture/Speech Control of a Large Display

In recent years, because of advances in computer vision research, free hand gestures have been explored as a means of human-computer interaction (HCI). Together with improved speech processing technology, this is an important step toward natural multimodal HCI. However, inclusion of non-predefined continuous gestures into a multimodal framework is a challenging problem. In this paper, we propose ...

Human-Content and Gesture-Event Video Coding

Currently, bandwidth limitations pose a major challenge for delivering high-quality multimedia information to users. In this research, we aim to provide a better compression of human-centered video sequences such as lectures, monologues, and presentations. Based on the idea that people pay more attention to face and hand regions in videos containing people speaking, our approach encodes those r...

Applying Pattern-based Classification to Sequences of Gestures

The pattern-based sequence classification system (PBSC) identifies regularly occurring patterns in collections of sequences and uses these patterns to predict meta-information. This automated system has been proven useful in identifying patterns in written language and musical notations. To illustrate the wide applicability of this approach, we classify symbolic representations of speech-accomp...

Publication date: 2004